Return-Path: <news@columbia.edu>
Received: from newsmaster.cc.columbia.edu (newsmaster.cc.columbia.edu [128.59.59.30])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id TAA24210
for <kermit.misc@watsun.cc.columbia.edu>; Mon, 17 Jan 2000 19:25:56 -0500 (EST)
Received: (from news@localhost)
by newsmaster.cc.columbia.edu (8.8.5/8.8.5) id TAA28343
for kermit.misc@watsun.cc.columbia.edu; Mon, 17 Jan 2000 19:06:58 -0500 (EST)
X-Authentication-Warning: newsmaster.cc.columbia.edu: news set sender to <news> using -f
From: fdc@watsun.cc.columbia.edu (Frank da Cruz)
Subject: Case Study #10: Atomic File Movement
Date: 18 Jan 2000 00:06:57 GMT
Organization: Columbia University
Message-ID: <860ar1$rlj$1@newsmaster.cc.columbia.edu>
To: kermit.misc@columbia.edu

Today let's look at the common situation in which files must be moved from
one computer to another for processing on a regular basis. For example,
daily business receipts are sent from a branch office or franchise to
company headquarters, or medical or pharmaceutical insurance claims from a
doctor's office, hospital, or pharmacy to a claims clearinghouse. Each file
contains a series of financial transactions, so we need to ensure that each
transaction occurs once and only once, and when it occurs, it occurs
completely and correctly. Of course other applications can be imagined too.

Let's call the two parties "Branch" and "Headquarters" (HQ). In a typical
scenario, Branch collects files (e.g. from each operator station) into a
directory and then transmits them every evening to HQ. The connection can
be made by traditional (non-PPP) dialup or by network. Of course Kermit is
equally suited to both. (That's a strong point of Kermit, remember? For
example, if you normally use a network connection but the net is broken,
you can fall back on old-fashioned dialup using the same script if it is
well-designed.)

The procedures for making the connection are well documented in the Kermit
manuals. Let's assume we have a connection already, we have already
authenticated or logged in, and there is a Kermit server on the far end.
Let's also assume that our current directory on the local computer contains
the files we need to send, and there are many of them. Of course we can
just tell the local Kermit to "SEND *.*" or whatever, but what happens if
the connection breaks and we have to start again? We don't want HQ to
receive multiple copies of the same transaction. (Obviously there should
be other safeguards but we won't discuss them here.)

There are several approaches to this problem, but the best one is Kermit's
new "atomic file movement" feature. In this case "atomic" is used in the
computer-science sense, not the physics one :-) The command is simple:

  SEND /DELETE *.*

This means, send all the files whose names match "*.*" (or any other
pattern or filename) and delete each one as soon as, and only if, it was
sent successfully (MOVE is a synonym for SEND /DELETE). Alternatively, you
can use:

  SEND /MOVE-TO:xxxx *.*

which, instead of deleting each successfully sent file, moves it to the
directory named xxxx. (A third choice, SEND /RENAME-TO:, is described
in the update notes.)

Now if the connection is lost, you can make a new connection and give the
same SEND /DELETE or SEND /MOVE-TO command again, and it sends only the
files that were not already sent successfully, because the ones that were
are gone.
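
To make the resend behavior concrete, here is a minimal Python simulation of
the "delete only after a successful send" rule. It is a sketch of the
semantics described above, not Kermit's implementation: send_and_delete,
make_flaky_transport, and LinkDown are hypothetical names invented for the
illustration, and the "connection" is just a function that fails on one call.

```python
import os
import tempfile


class LinkDown(Exception):
    """Raised by the stand-in transport when the simulated connection drops."""


def send_and_delete(directory, transmit):
    """Send every file in `directory`, deleting each one only after it is sent.

    If `transmit` raises, files already sent are gone and the rest stay put.
    """
    sent = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        with open(path, "rb") as f:
            transmit(name, f.read())  # may raise LinkDown
        os.remove(path)               # reached only if the send succeeded
        sent.append(name)
    return sent


def make_flaky_transport(received, fail_on_call):
    """Hypothetical transport: records file names, drops the link on one call."""
    calls = {"n": 0}

    def transmit(name, data):
        calls["n"] += 1
        if calls["n"] == fail_on_call:
            raise LinkDown(name)
        received.append(name)

    return transmit


def demo():
    received = []
    with tempfile.TemporaryDirectory() as outbox:
        for name in ("a.dat", "b.dat", "c.dat"):
            with open(os.path.join(outbox, name), "w") as f:
                f.write("transactions for " + name)
        transmit = make_flaky_transport(received, fail_on_call=2)
        while os.listdir(outbox):      # retry until nothing is left to send
            try:
                send_and_delete(outbox, transmit)
            except LinkDown:
                pass                   # "reconnect" and give the same command again
    return received


if __name__ == "__main__":
    print(demo())
```

The simulated link drops while b.dat is being sent. The second pass picks up
with b.dat and c.dat only, since a.dat was removed as soon as it went through.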

Meanwhile, back at Headquarters we encounter the classic conundrum: how to
know when a file has been completely uploaded? Let's suppose some process
at HQ (besides Kermit) waits for new files to appear in the upload
directory. Well, each file "appears" as soon as it is opened, but it might
be open for some time while the Kermit receiver is writing new material to
it (the same is true, of course, for FTP). We don't want to start
processing it until it has arrived completely, but we also don't want to
wait forever.

Here again, atomic file movement is the answer. If the Kermit server at HQ
is given the command:

  SET RECEIVE MOVE-TO xxxx

(where xxxx is the name of a directory), this tells it to move each
received file to the specified directory after, and only if, it is received
successfully. So the script to start up the server at HQ might look like
this:

  cd /incoming/tmp/
  set receive move-to /incoming/ready/
  server
  exit

The move uses an underlying API chosen to be atomic. On UNIX, for example,
it is the rename() system call (or link() when rename() is not available).
The instant the file appears in the /incoming/ready/ directory, it is
complete and ready to use, not in the middle of being copied. And it won't
come back to haunt you again after processing, because the Branch won't
upload it again.
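
The same receive-then-rename pattern can be sketched in Python, whose
os.rename() wraps rename() on UNIX. This is only an illustration of the idea,
not what the Kermit server does internally: receive_then_publish and
process_ready are hypothetical names, and the sketch assumes the temporary
and ready directories are on the same filesystem, since rename() does not
move files across filesystems.

```python
import os
import tempfile


def receive_then_publish(name, chunks, tmp_dir, ready_dir):
    """Write the incoming file under tmp_dir, then rename it into ready_dir."""
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())          # push the data to disk before publishing
    final_path = os.path.join(ready_dir, name)
    os.rename(tmp_path, final_path)   # the file shows up in ready_dir all at once
    return final_path


def process_ready(ready_dir, handle):
    """Hand each file in ready_dir to `handle`, then remove it."""
    done = []
    for name in sorted(os.listdir(ready_dir)):
        path = os.path.join(ready_dir, name)
        with open(path, "rb") as f:
            handle(name, f.read())
        os.remove(path)
        done.append(name)
    return done


def demo():
    with tempfile.TemporaryDirectory() as base:
        tmp_dir = os.path.join(base, "tmp")
        ready_dir = os.path.join(base, "ready")
        os.mkdir(tmp_dir)
        os.mkdir(ready_dir)

        # One complete upload, published by rename.
        receive_then_publish("claims.dat", [b"part1,", b"part2"], tmp_dir, ready_dir)

        # One upload still in progress: it exists only in tmp_dir.
        with open(os.path.join(tmp_dir, "partial.dat"), "wb") as f:
            f.write(b"incompl")

        seen = {}
        names = process_ready(ready_dir, lambda n, d: seen.__setitem__(n, d))
        return names, seen, os.listdir(tmp_dir)


if __name__ == "__main__":
    print(demo())
```

The processing side looks only at the ready directory, so it never sees
partial.dat, and it needs no guesswork about whether a file is still being
written.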

As for making sure the files get through despite repeated disconnections,
see the 'deliver' script on page 453 of "Using C-Kermit" or in the C-Kermit
script library:

  ftp://kermit.columbia.edu/kermit/scripts/ckermit/deliver

For details about atomic file movement, see Sections 4.0.8, 4.1.3, and 4.7
of the ckermit2.txt file.

- Frank